    The geopolitics behind the routes data travels: a case study of Iran

    The global expansion of the Internet has brought many challenges to geopolitics. Cyberspace is a space of strategic priority for many states. Understanding and representing its geography remains an ongoing challenge. Nevertheless, we need to comprehend Cyberspace as a space organized by humans in order to analyse the strategies of its actors. This geography requires a multidisciplinary dialogue associating geopolitics, computer science and mathematics. Cyberspace is represented as three superposed and interacting layers: the physical, logical, and informational layers. This paper focuses on the logical layer through an analysis of the structure of connectivity and of the Border Gateway Protocol (BGP). This protocol determines the routes taken by the data. It has been leveraged by countries to control the flow of information and to block access to content (up to full disruption of the Internet), or for active strategic purposes such as hijacking traffic or attacking infrastructures. Several countries have opted for a BGP strategy. The goal of this study is to characterize these strategies, to link them to current architectures and to understand their resilience in times of crisis. Our hypothesis is that there are connections between the network architecture shaped through BGP and the strategies of stakeholders at the national level. We chose to focus on the case of Iran because it presents an interesting BGP architecture and holds a central position in the connectivity of the Middle East. Moreover, Iran is at the center of several ongoing geopolitical rifts. Our observations make it possible to infer three ways in which Iran could have used BGP to achieve its strategic goals: the pursuit of a self-sustaining national Internet with controlled borders; the will to set up an Iranian Intranet to facilitate censorship; and the leverage of connectivity as a tool of regional influence.
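
    As a rough illustration of the kind of logical-layer analysis described above (a hedged sketch, not the authors' toolchain), the snippet below counts which foreign ASes appear directly adjacent to a given set of national ASes in BGP AS paths. The AS numbers and the input format (one space-separated AS path per line) are illustrative assumptions; a small number of national ASes concentrating all foreign adjacencies would be consistent with tightly controlled borders.

```python
from collections import defaultdict

NATIONAL_ASES = {12880, 48159, 58224}  # illustrative "national" AS numbers

def border_neighbors(as_path_lines, national=NATIONAL_ASES):
    """For each national AS, collect the foreign ASes seen directly next to it in AS paths."""
    neighbors = defaultdict(set)
    for line in as_path_lines:
        path = [int(asn) for asn in line.split()]
        for prev, cur in zip(path, path[1:]):
            if cur in national and prev not in national:
                neighbors[cur].add(prev)
            if prev in national and cur not in national:
                neighbors[prev].add(cur)
    return neighbors

# Toy AS paths; in practice these would come from a BGP table dump.
paths = ["3356 6762 12880 58224", "174 48159 12880"]
print({asn: sorted(s) for asn, s in border_neighbors(paths).items()})
```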

    Best Basis for Joint Representation: the Median of Marginal Best Bases for Low Cost Information Exchanges in Distributed Signal Representation

    The paper addresses the selection of the best representations for distributed and/or dependent signals. Given an indexed tree-structured library of bases and a semi-collaborative distribution scheme associated with minimum information exchange (emission and reception of a single index corresponding to a marginal best basis), the paper proposes the median basis computed on a set of best marginal bases for the joint representation or fusion of distributed/dependent signals. The paper provides algorithms for computing this median basis with respect to standard tree-structured libraries of bases such as wavelet packet bases or cosine trees. These algorithms are effective when an additive information cost is under consideration. Experimental results on distributed signal compression confirm worthwhile properties of the median of marginal best bases with respect to the ideal best joint basis, the latter being underdetermined in practice except when a full collaboration scheme is under consideration.
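
    For readers unfamiliar with best-basis selection, the sketch below shows a standard Coifman-Wickerhauser-style bottom-up search over a dyadic tree with an additive cost, which is the setting the abstract assumes; the node indexing and toy costs are illustrative, and this is not the paper's median-basis algorithm itself.

```python
def best_basis(cost_of, depth):
    """Bottom-up dynamic program over a dyadic tree: keep a node if its additive
    cost (e.g. entropy of its coefficients) beats the sum of its children's best costs."""
    best_cost, chosen = {}, {}
    for level in range(depth, -1, -1):
        for pos in range(2 ** level):
            node = (level, pos)
            if level == depth:                      # leaves: nothing to compare against
                best_cost[node], chosen[node] = cost_of[node], [node]
            else:
                kids = [(level + 1, 2 * pos), (level + 1, 2 * pos + 1)]
                child_cost = sum(best_cost[k] for k in kids)
                if cost_of[node] <= child_cost:     # additivity makes this local test valid
                    best_cost[node], chosen[node] = cost_of[node], [node]
                else:
                    best_cost[node] = child_cost
                    chosen[node] = chosen[kids[0]] + chosen[kids[1]]
    return chosen[(0, 0)]                           # list of (level, position) nodes kept

# Toy tree of depth 1 where splitting the root is cheaper than keeping it whole.
costs = {(0, 0): 5.0, (1, 0): 1.0, (1, 1): 2.0}
print(best_basis(costs, depth=1))                   # -> [(1, 0), (1, 1)]
```

    In the semi-collaborative scheme described above, each sensor would run such a search on its own signal and transmit only the index of its marginal best basis; the median basis is then computed over that small set of indices rather than over the raw signals.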

    A Signal-Processing View on Packet Sampling and Anomaly Detection

    Anomaly detection methods typically operate on preprocessed traffic traces. Firstly, most traffic capturing devices today employ random packet sampling, where each packet is selected with a certain probability, to cope with increasing link speeds. Secondly, temporal aggregation, where all packets in a measurement interval are represented by their temporal mean, is applied to transform the traffic trace to the observation timescale of interest for anomaly detection. These preprocessing steps affect the temporal correlation structure of traffic that is used by anomaly detection methods such as Kalman filtering or PCA, and thus have an impact on anomaly detection performance. Prior work has analyzed how packet sampling degrades the accuracy of anomaly detection methods; however, neither theoretical explanations nor solutions to the sampling problem have been provided. This paper makes the following key contributions: (i) it provides a thorough analysis and quantification of how random packet sampling and temporal aggregation modify the signal properties by introducing noise, distortion and aliasing; (ii) we show that the aliasing introduced by the aggregation step has the largest impact on the correlation structure; (iii) we further propose to replace the aggregation step with a specifically designed low-pass filter that reduces the aliasing effect; (iv) finally, we show that with our solution applied, the performance of anomaly detection systems can be considerably improved in the presence of packet sampling.
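
    A minimal sketch of the kind of preprocessing change argued for above (an assumption about how such a fix could look, not the authors' filter design): the boxcar average used by plain temporal aggregation is replaced by a low-pass filter cutting below the new Nyquist frequency before downsampling. The synthetic traffic series, sampling rate and filter order are illustrative.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
fs = 1000                     # samples/s of the fine-grained traffic series
bin_len = 100                 # samples per measurement interval (observation timescale)
x = rng.poisson(50, 60 * fs).astype(float)   # synthetic per-millisecond byte counts

# (a) Temporal aggregation: a boxcar mean per bin. Its sinc-shaped frequency response
#     decays slowly, so energy above the new Nyquist rate folds back (aliasing).
aggregated = x.reshape(-1, bin_len).mean(axis=1)

# (b) Anti-aliasing: low-pass filter below the new Nyquist frequency, then decimate.
cutoff_hz = 0.8 * (fs / bin_len) / 2
sos = signal.butter(8, cutoff_hz, btype="low", fs=fs, output="sos")
filtered = signal.sosfiltfilt(sos, x)[::bin_len]

print(aggregated.shape, filtered.shape)   # same observation timescale, less aliasing in (b)
```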

    Faving Reciprocity in Content Sharing Communities: A comparative analysis of Flickr and Twitter

    In the Web 2.0 era, users share and discover interesting content via a network of relationships created in various social networking or content sharing sites. They can become, for example, contacts, followers or friends, and express their appreciation of specific content uploaded by their peers by faving, retweeting or liking it, depending on whether they are on Flickr, Twitter or Facebook respectively. They can then discover additional content of interest through the lists of favorites of their contacts, and so on. This faving (or favoring) functionality thus becomes a central part of content sharing communities for two purposes: (a) it helps the propagation of content amongst users and (b) it stimulates users' participation and activity. In this paper, we make a first step towards understanding users' faving behavior in content sharing communities in terms of reciprocity, using publicly available datasets from Flickr and Twitter. Do users favor content only when they really appreciate it, or do they often feel the need to reciprocate when their content is appreciated by one of their contacts or even by a stranger? Do people take advantage of this process to gain popularity? What is the impact of the design (the social software) of a specific community and of the type of content shared? These are some of the questions that our first results help to answer.
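
    As a hedged illustration of the reciprocity question raised above (not the paper's measurement code), the snippet below computes the fraction of directed faver-to-owner relations that are reciprocated; the event format and user names are made up.

```python
def reciprocity_rate(fave_events):
    """fave_events: iterable of (faver, owner) pairs, one per fave event."""
    pairs = set(fave_events)                      # who has faved whom at least once
    reciprocated = sum((owner, faver) in pairs for faver, owner in pairs)
    return reciprocated / len(pairs) if pairs else 0.0

events = [("alice", "bob"), ("bob", "alice"), ("carol", "alice")]
print(reciprocity_rate(events))   # 2 of 3 directed relations are reciprocated -> 0.666...
```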

    Applying PCA for Traffic Anomaly Detection: Problems and Solutions

    Spatial Principal Component Analysis (PCA) has been proposed for network-wide anomaly detection. Recent work has shown that PCA is very sensitive to calibration settings. Unfortunately, the authors did not provide further explanations for this observation. In this paper, we fill this gap and provide the reasoning behind the observed discrepancies. We revisit PCA for anomaly detection and evaluate its performance on our data. We develop a slightly modified version of PCA that uses only data from a single router. Instead of correlating data across different spatial measurement points, we correlate the data across different metrics. With the help of the analyzed data, we explain the pitfalls of PCA and underline our argumentation with measurement results. We show that the main problem is that PCA fails to capture temporal correlation. We propose a solution to deal with this problem by replacing PCA with the Karhunen-Loeve transform. We find that when we consider temporal correlation, anomaly detection results are significantly improved.
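
    For context, the sketch below shows the classic subspace method being revisited, applied to a single router's metrics as described above (a minimal illustration, not the paper's code or its Karhunen-Loeve replacement): time bins whose energy outside the top principal components is large are flagged. The matrix shape, component count and injected anomaly are illustrative.

```python
import numpy as np

def fit_normal_subspace(X_train, n_components=3):
    """X_train: (time_bins, metrics) matrix, e.g. packets/bytes/flows per interval."""
    mean = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components].T           # top principal directions = "normal" subspace

def spe_scores(X, mean, P):
    """Squared prediction error: energy left after removing the normal-subspace part."""
    Xc = X - mean
    residual = Xc - Xc @ P @ P.T
    return (residual ** 2).sum(axis=1)

rng = np.random.default_rng(1)
train = rng.normal(size=(500, 8))               # anomaly-free calibration window
test = rng.normal(size=(200, 8))
test[120] += 10                                 # injected volume anomaly across all metrics
mean, P = fit_normal_subspace(train)
print(spe_scores(test, mean, P).argmax())       # expected: 120, the injected bin
```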

    An Approach to Model and Predict the Popularity of Online Contents with Explanatory Factors

    In this paper, we propose a methodology to predict the popularity of online contents. More precisely, rather than trying to infer the popularity of a content itself, we infer the likelihood that a content will be popular. Our approach is rooted in survival analysis, where predicting the precise lifetime of an individual is very hard and almost impossible, but predicting the likelihood of one's survival longer than a threshold, or longer than another individual, is possible. We take the standpoint of an external observer who has to infer the popularity of a content using only publicly observable metrics, such as the lifetime of a thread, the number of comments, and the number of views. Our goal is to infer these observable metrics using a set of explanatory factors, such as the number of comments and the number of links in the first hours after the content's publication, which are observable by the external observer. We use a Cox proportional hazard regression model that divides the distribution function of the observable popularity metric into two components: (a) one that can be explained by the given set of explanatory factors (called risk factors) and (b) a baseline distribution function that integrates all the factors not taken into account. To validate our proposed approach, we use datasets from two different online discussion forums: dpreview.com, one of the largest online discussion groups providing news and discussion forums about all kinds of digital cameras, and myspace.com, one of the representative online social networking services. On these two datasets we model two different popularity metrics, the lifetime of threads and the number of comments, and show that our approach can predict the lifetime of threads from Dpreview (Myspace) by observing a thread during the first 5-6 days (24 hours, respectively), and the number of comments of Dpreview threads by observing a thread during the first 2-3 days.
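
    A minimal sketch of how such a Cox model could be fitted in practice, assuming the lifelines Python library and made-up column names and data (the paper's actual risk factors and forum datasets are not reproduced here):

```python
import pandas as pd
from lifelines import CoxPHFitter

# One toy row per thread: observed lifetime, whether the thread had already died
# when observation ended (0 = censored), and early explanatory "risk factors".
df = pd.DataFrame({
    "lifetime_h":        [12, 48, 5, 130, 72, 9, 200, 30],
    "dead":              [1, 1, 1, 0, 1, 1, 0, 1],
    "comments_first_6h": [8, 10, 1, 25, 4, 3, 18, 6],
    "links_in_post":     [1, 0, 0, 2, 3, 0, 1, 2],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="lifetime_h", event_col="dead")
cph.print_summary()   # hazard ratios of the risk factors; the baseline hazard stays non-parametric
```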

    Practical Bloom filter based epidemic forwarding and congestion control in DTNs: A comparative analysis

    Epidemic forwarding has been proposed as a forwarding technique to achieve opportunistic communication in delay tolerant networks (DTNs). Even though this technique is well known and widely referenced, several practical problems have to be addressed before it can be used. Unfortunately, while the literature on DTNs is full of new techniques, very little has been done to compare them. In particular, while Bloom filters have been proposed to exchange information about buffer contents prior to sending data in order to avoid redundant retransmissions, to the best of our knowledge no real evaluation has been provided of the tradeoffs that exist when using Bloom filters in practice. A second practical issue in DTNs is buffer management (resulting from finite buffers) and congestion control (resulting from greedy sources). This has also been the topic of several papers, which have already uncovered how difficult it is to acquire the accurate information needed to regulate data transmission rates and buffer space. In this paper, we fill this gap. We have implemented different proposed congestion control schemes for epidemic forwarding in the ns-3 simulation environment. We use this simulation to compare the proposed schemes and to uncover the issues that remain in each of them. Based on this analysis, we propose strategies for Bloom filter management based on windowing and describe implementation tradeoffs. We then propose a back-pressure rate control as well as an aging-based buffer management solution to deal with congestion control. By simulating our proposed mechanisms in ns-3, both with random-waypoint mobility and with realistic mobility traces from San Francisco taxicabs, we show that the proposed mechanisms alleviate the challenges of using epidemic forwarding in DTNs.
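
    A minimal sketch of the buffer-summary idea discussed above (illustrative sizes and hashing, not the paper's ns-3 implementation): each node compresses the identifiers of the messages it buffers into a fixed-size Bloom filter, exchanges that bitmap with a peer, and the peer forwards only the messages that do not appear to be present.

```python
import hashlib

class BloomFilter:
    def __init__(self, m_bits=1024, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]   # double hashing

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):   # may return false positives, never false negatives
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

# Node A summarises its buffer and sends only the 1024-bit filter to node B;
# B then forwards only the messages that do not appear to be in A's buffer.
buffer_a = {"msg-17", "msg-42"}
summary = BloomFilter()
for msg_id in buffer_a:
    summary.add(msg_id)
to_forward = [m for m in {"msg-17", "msg-99"} if m not in summary]
print(to_forward)   # likely ['msg-99']
```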

    Evaluating and Optimizing IP Lookup on Many-core Processors

    In recent years, there has been growing interest in multi/many-core processors as a target architecture for high-performance software routers. Because of its key position in routers, hardware IP lookup implementation has been intensively studied with TCAM- and FPGA-based architectures. However, increasing interest in software implementations has also been observed. In this paper, we evaluate the performance of software-only IP lookup on a many-core chip, the TILEPro64 processor. For this purpose we have implemented two widely used IP lookup algorithms, DIR-24-8-BASIC and Tree Bitmap. We evaluate the performance of these two algorithms on the TILEPro64 processor with both synthetic and real-world traces. After a detailed analysis, we propose a hybrid scheme which provides high lookup speed and low worst-case update overhead. Our work shows how to exploit the architectural features of the TILEPro64 to improve performance, including many optimizations at both the single-core and parallelism levels. Experimental results show that by using only 18 cores we can achieve a lookup throughput of 60 Mpps (almost 40 Gbps) with low power consumption, which demonstrates the great performance potential of many-core processors.
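
    For illustration, the sketch below is a plain Python rendition of the DIR-24-8-BASIC scheme named above (not the TILEPro64 code): a 2^24-entry table indexed by the top 24 address bits answers most lookups in one access, with 256-entry overflow chunks for prefixes longer than /24. The simplification that routes are inserted in order of increasing prefix length is an assumption of this sketch.

```python
from array import array
import ipaddress

TBL24 = array("i", [-1]) * (1 << 24)   # one entry per /24 (about 64 MB here): next hop or -1
TBL_LONG = []                          # 256-entry chunks for prefixes longer than /24
OVERFLOW = 1 << 30                     # flag: the TBL24 entry is an index into TBL_LONG

def add_route(prefix, next_hop):
    """Sketch only: add routes in order of increasing prefix length so longer prefixes win."""
    net = ipaddress.ip_network(prefix)
    base = int(net.network_address)
    if net.prefixlen <= 24:
        start = base >> 8
        for i in range(start, start + (1 << (24 - net.prefixlen))):
            TBL24[i] = next_hop
    else:
        idx24, entry = base >> 8, TBL24[base >> 8]
        if entry >= 0 and entry & OVERFLOW:          # overflow chunk already allocated
            chunk = TBL_LONG[entry & ~OVERFLOW]
        else:                                        # allocate a chunk seeded with the /24 route
            chunk = array("i", [entry]) * 256
            TBL_LONG.append(chunk)
            TBL24[idx24] = OVERFLOW | (len(TBL_LONG) - 1)
        low = base & 0xFF
        for i in range(low, low + (1 << (32 - net.prefixlen))):
            chunk[i] = next_hop

def lookup(addr):
    """At most two table accesses per lookup, which is the point of the scheme."""
    a = int(ipaddress.ip_address(addr))
    entry = TBL24[a >> 8]
    if entry >= 0 and entry & OVERFLOW:
        entry = TBL_LONG[entry & ~OVERFLOW][a & 0xFF]
    return entry if entry >= 0 else None

add_route("10.0.0.0/8", 1)
add_route("10.1.2.128/25", 2)
print(lookup("10.1.2.200"), lookup("10.9.9.9"))      # -> 2 1
```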

    Modeling and Predicting the Popularity of Online Contents with Cox Proportional Hazard Regression Model

    Special Issue on Advances in Web Intelligence. We propose a general framework which can be used for modeling and predicting the popularity of online contents. The aim of our modeling is not to infer the precise popularity value of a content, but to infer the likelihood that the content will be popular. Our approach is rooted in survival analysis, which deals with the survival time until an event such as failure or death. Survival analysis assumes that predicting the precise lifetime of an instance is very hard, but that predicting the likelihood of the lifetime of an instance is possible based on its hazard distribution. Additionally, we take the standpoint of an external observer who has to model the popularity of contents only with publicly available information. Thus, the goal of our proposed methodology is to model a certain popularity metric, such as the lifetime of a content or the number of comments which a content receives, with a set of explanatory factors that are observable by the external observer. Among the various parametric and non-parametric approaches to survival analysis, we use the Cox proportional hazard regression model, which divides the distribution function of a certain popularity metric into two components: one that is explained by a set of explanatory factors, called risk factors, and another, a baseline survival distribution function, which integrates all the factors not taken into account. In order to validate our proposed methodology, we use two datasets crawled from two different discussion forums, forum.dpreview.com and forums.myspace.com, which are, respectively, one of the largest discussion forums dealing with various issues on digital cameras and a discussion forum provided by a representative social networking service. We model two different popularity metrics, the lifetime of threads and the number of comments, and we show that the models can predict the lifetime of threads from Dpreview (Myspace) by observing a thread during the first 5-6 days (24 hours, respectively), and the number of comments of Dpreview threads by observing a thread during the first 2-3 days.
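
    For reference, the two-component decomposition described above is the standard Cox proportional-hazards form (a textbook statement, not an equation quoted from the paper). For a thread with risk-factor vector x,

    \[
      h(t \mid x) = h_0(t)\, e^{\beta^{\top} x},
      \qquad
      S(t \mid x) = S_0(t)^{\exp(\beta^{\top} x)},
    \]

    where h_0 is the baseline hazard and S_0 the baseline survival function integrating all factors not modeled explicitly, and beta weights the observable risk factors.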